Evaluating the Quality of Automatically Extracted Synonymy Information

Authors

A. Kumaran

Ranbeer Makin

Vijay Pattisapu

Shaik Sharif

Lucy Vanderwende

Abstract

Automatic extraction of semantic information, if successful, offers languages with few or poor resources the prospect of creating ontological resources inexpensively, and thus of supporting common-sense reasoning applications in those languages. In this paper we explore the automatic extraction of synonymy information from large corpora using two complementary techniques: a generic broad-coverage parser that generates bits of semantic information, and automatic sense disambiguation that synthesizes them into sets of synonyms. To validate the quality of the synonymy information thus extracted, we experiment with English, for which appropriate semantic resources are already available. We cull synonymy information from a large corpus and compare it against the synonymy information available in several standard sources. We present the results of our methodology, both quantitatively and qualitatively; they indicate that good-quality synonymy information may be extracted automatically from large corpora using the proposed methodology.
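The comparison of extracted synonym sets against a standard source could be sketched roughly as follows. This is only an illustration: the abstract does not say which measures the authors used, so pair-level precision and recall are an assumption here, and the function names and the toy word sets are hypothetical.

```python
from itertools import combinations


def synset_pairs(synsets):
    """Expand synonym sets into a set of unordered, lowercased word pairs."""
    pairs = set()
    for synset in synsets:
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        words = sorted({word.lower() for word in synset})
        pairs.update(combinations(words, 2))
    return pairs


def precision_recall(extracted, reference):
    """Pair-level precision and recall of extracted sets against a reference.

    Assumption: a synonym pair counts as correct if the same pair occurs
    in some reference synonym set. This is not taken from the paper.
    """
    extracted_pairs = synset_pairs(extracted)
    reference_pairs = synset_pairs(reference)
    hits = extracted_pairs & reference_pairs
    precision = len(hits) / len(extracted_pairs) if extracted_pairs else 0.0
    recall = len(hits) / len(reference_pairs) if reference_pairs else 0.0
    return precision, recall


# Hypothetical toy data, not results from the paper.
extracted = [{"car", "automobile", "truck"}, {"big", "large"}]
reference = [{"car", "automobile"}, {"big", "large", "huge"}]

p, r = precision_recall(extracted, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

Scoring word pairs rather than whole sets gives partial credit to a synonym set that is mostly right, which seems a reasonable choice when the extracted and reference sets rarely match exactly.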


Related resources

Automatic Extraction of Synonymy Information:


Automatic Extraction of Synonymy Information: An Extended Abstract


Concept Extraction and Synonymy Management for Biomedical Information Retrieval

This paper reports on work done for the Genomics Track at TREC 2004 by ConverSpeech LLC in conjunction with scientists at the Saccharomyces Genome Database (SGD), the model organism database located at Stanford University, California. The rapidly increasing number of articles in the biomedical literature has created new urgency for software tools that find information relevant to specific infor...


A Model for Extracting Information from Text Documents, Based on Text Mining in the E-Learning Domain

As computer networks become the backbone of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...


The Criteria for Evaluation of the Integration of Information and Communication Technology in the Curriculum: A Systematic Review

Objective: This study aimed to review the criteria for evaluating the integration of information and communication technology (ICT) in the curriculum, and given its significance, provide the necessary assessment recommendations. Material & Methods: This study was a theoretical-systematic review performed with keywords such as "integration," "evaluation," "Information and communication technolo...



Journal:

LDV Forum

Volume 23

Publication date: 2008
